Factor analysis for audio-based video genre classification

Authors

  • Mickael Rouvier
  • Driss Matrouf
  • Georges Linarès
Abstract

Statistical classifiers operate on features that generally include both useful and useless information. These two types of information are difficult to separate in the feature domain. Recently, a new paradigm based on Latent Factor Analysis (LFA) proposed decomposing the model into useful and useless components. This method was successfully applied to speaker and language recognition tasks. In this paper, we study the use of LFA for video genre classification using only the audio channel. We propose a classification method based on short-term cepstral features and Gaussian Mixture Model (GMM) or Support Vector Machine (SVM) classifiers, combined with Factor Analysis (FA). Experiments are conducted on a corpus composed of 5 types of video (music, commercials, cartoons, movies and news). Relative to the baseline system, a Gaussian Mixture Model with Universal Background Model (GMM-UBM), the best factor analysis configuration reduces classification error by about 56%, corresponding to a correct identification rate of about 90%.
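
The two figures quoted at the end of the abstract can be related by simple arithmetic. The sketch below is an illustration only: it assumes that "relative error reduction" means (baseline error - final error) / baseline error, and the function name `implied_baseline_error` is hypothetical. The resulting baseline value is inferred from the rounded numbers in the abstract, not reported there.

```python
def implied_baseline_error(final_error: float, relative_reduction: float) -> float:
    """Baseline error implied by a final error and a relative error reduction.

    Assumes relative_reduction = (baseline - final) / baseline.
    """
    if not 0.0 <= relative_reduction < 1.0:
        raise ValueError("relative_reduction must be in [0, 1)")
    return final_error / (1.0 - relative_reduction)


# Rounded figures from the abstract: about 90% correct, about 56% relative reduction.
final_error = 1.0 - 0.90
baseline_error = implied_baseline_error(final_error, 0.56)

# An inferred, approximate value (roughly 23% baseline error), not a reported result.
print(f"implied baseline error: {baseline_error:.1%}")
```

Under that assumption, the GMM-UBM baseline would have been correct on roughly three videos out of four, which gives a sense of the size of the reported gain.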

Similar resources

Robust audio-based classification of video genre

Video genre classification is a challenging task in a global context of fast growing video collections available on the Internet. This paper presents a new method for video genre identification by audio analysis. Our approach relies on the combination of low and high level audio features. We investigate the discriminative capacity of features related to acoustic instability, speaker interactivi...

Modeling nuisance variabilities with factor analysis for GMM-based audio pattern classification

Audio pattern classification represents a particular statistical classification task and includes, for example, speaker recognition, language recognition, emotion recognition, speech recognition and, recently, video genre classification. The feature being used in all these tasks is generally based on a short-term cepstral representation. The cepstral vectors contain at the same time useful info...

Audio-Visual content description for video genre classification in the context of social media

In this paper we address automatic video genre classification with descriptors extracted from both audio (block-based features) and visual (color- and temporal-based) modalities. Tests performed on 26 genres from the blip.tv media platform demonstrate the potential of these descriptors for this task.

Content-Based Video Description for Automatic Video Genre Categorization

In this paper, we propose an audio-visual approach to video genre categorization. It exploits audio, color, temporal and contour information, which are in general genre-specific. Audio information is extracted at block level, which has the advantage of capturing local temporal information. At the temporal level, we assess action content with respect to human perception. Further, color perception is...

Video genre categorization and representation using audio-visual information

We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at block level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using st...

Publication date: 2009